Convergence Analysis of Natural Gradient Descent for Over-parameterized Physics-Informed Neural Networks

Xu, Xianliang, Du, Ting, Kong, Wang, Li, Ye, Huang, Zhongyi

arXiv.org Artificial Intelligence

First-order methods, such as gradient descent (GD) and stochastic gradient descent (SGD), have been proven effective in training neural networks. In the context of over-parameterization, a line of work has demonstrated that randomly initialized (stochastic) gradient descent converges to a globally optimal solution at a linear rate for the quadratic loss function. However, the learning rate of GD for training two-layer neural networks exhibits poor dependence on the sample size and the Gram matrix, leading to a slow training process. In this paper, we show that for $L^2$ regression problems, the learning rate can be improved from $\mathcal{O}(\lambda_0/n^2)$ to $\mathcal{O}(1/\|\bm{H}^{\infty}\|_2)$, which implies that GD actually enjoys a faster convergence rate. Furthermore, we generalize the method to GD in training two-layer Physics-Informed Neural Networks (PINNs), showing a similar improvement in the learning rate. Although the improved learning rate has only a mild dependence on the Gram matrix, it must still be set small enough in practice because the eigenvalues of the Gram matrix are unknown. More importantly, the convergence rate is tied to the least eigenvalue of the Gram matrix, which can lead to slow convergence. In this work, we provide a convergence analysis of natural gradient descent (NGD) in training two-layer PINNs, demonstrating that the learning rate can be $\mathcal{O}(1)$ and that, at this rate, the convergence rate is independent of the Gram matrix.
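The contrast the abstract draws can be illustrated on a toy problem. The sketch below is our own illustration, not the paper's construction: it uses plain linear $L^2$ regression, where the role of the Gram matrix is played by $H = X^\top X/n$, and it stands in for NGD with Gauss-Newton preconditioning by $H^{-1}$. GD needs a step size scaling like $1/\|H\|_2$ and its speed is throttled by the least eigenvalue of $H$, while the preconditioned update tolerates a learning rate of $1$ and converges independently of the spectrum.

```python
import numpy as np

# Toy L^2 regression: minimize L(w) = 0.5/n * ||X w - y||^2.
# H = X^T X / n is the Hessian/Gram matrix (a stand-in for the
# paper's H^infty; the setup here is our illustrative assumption).
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d)) @ np.diag([10.0, 5.0, 1.0, 0.5, 0.1])  # ill-conditioned
w_star = rng.normal(size=d)
y = X @ w_star

H = X.T @ X / n

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

def grad(w):
    return X.T @ (X @ w - y) / n

# Gradient descent: the stable step size scales like 1 / ||H||_2,
# and progress on the slowest mode is governed by H's least eigenvalue.
w_gd = np.zeros(d)
eta_gd = 1.0 / np.linalg.norm(H, 2)   # spectral norm = largest eigenvalue
for _ in range(500):
    w_gd -= eta_gd * grad(w_gd)

# "Natural" gradient step (Gauss-Newton preconditioning by H^{-1}):
# an O(1) learning rate works, and convergence no longer depends on
# the spectrum of H -- here it recovers w_star in a single step.
w_ngd = np.zeros(d)
eta_ngd = 1.0
for _ in range(5):
    w_ngd -= eta_ngd * np.linalg.solve(H, grad(w_ngd))

print(loss(w_gd), loss(w_ngd))
```

Despite 500 GD steps versus 5 preconditioned steps, the preconditioned iterate reaches machine precision while GD is still grinding down the mode associated with the least eigenvalue.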


Polynomial convergence of iterations of certain random operators in Hilbert space

Ghosh, Soumyadip, Lu, Yingdong, Nowicki, Tomasz J.

arXiv.org Artificial Intelligence

We prove a polynomial convergence rate for the average of the sequence, determined explicitly by the regularity of the initial state alone (Theorems 1 and 2). For convergence of the second moment, under a regularity condition on the random distribution (Assumption (A)), the convergence rate remains the same (Theorem 3). In other words, under (A), the regularity of the random sequence affects only the coefficient, not the order, of the polynomial convergence. Additionally, we demonstrate almost sure convergence of the sequence (Theorem 4).


Analyze and Design Network Architectures by Recursion Formulas

Liao, Yilin, Wang, Hao, Liu, Zhaoran, Li, Haozhe, Liu, Xinggao

arXiv.org Artificial Intelligence

The effectiveness of shortcut/skip-connections has been widely verified, inspiring extensive exploration of neural architecture design. This work attempts to find an effective way to design new network architectures. It is discovered that the main differences between network architectures can be reflected in their recursion formulas. Based on this observation, a methodology is proposed to design novel network architectures from the perspective of mathematical formulas. A case study is then provided that generates an improved architecture based on ResNet. The new architecture is compared with ResNet and tested on ResNet-based networks. Extensive experiments on CIFAR and ImageNet demonstrate the significant performance improvements provided by the architecture.
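The "recursion formula" view can be made concrete with a minimal sketch (our illustration, not the paper's methodology): a plain feed-forward network and a ResNet differ only in the recurrence that maps layer $l$ to layer $l+1$, $x_{l+1} = F(x_l)$ versus $x_{l+1} = x_l + F(x_l)$. The branch $F$ here is an arbitrary stand-in transform.

```python
import numpy as np

# Two architectures as recursions over the layer index l.
rng = np.random.default_rng(0)
d, depth = 4, 6
Ws = [rng.normal(scale=0.5, size=(d, d)) for _ in range(depth)]

def F(x, W):
    """One branch: a single nonlinear transform ReLU(Wx) (a stand-in)."""
    return np.maximum(W @ x, 0.0)

x0 = rng.normal(size=d)

# Plain feed-forward recursion: x_{l+1} = F(x_l)
x = x0.copy()
for W in Ws:
    x = F(x, W)

# ResNet recursion with identity shortcut: x_{l+1} = x_l + F(x_l)
r = x0.copy()
for W in Ws:
    r = r + F(r, W)

print(np.linalg.norm(x), np.linalg.norm(r))
```

Writing architectures this way makes the design space explicit: proposing a new architecture amounts to proposing a new recurrence, which can then be analyzed or modified before any training is run.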